PaLM: Scaling Language Modeling with Pathways

Chowdhery, Aakanksha, Narang, Sharan, Devlin, Jacob, Bosma, Maarten, Mishra, Gaurav, Roberts, Adam, Barham, Paul, Chung, Hyung Won, Sutton, Charles, Gehrmann, Sebastian, Schuh, Parker, Shi, Kensen, Tsvyashchenko, Sasha, Maynez, Joshua, Rao, Abhishek, Barnes, Parker, Tay, Yi, Shazeer, Noam, Prabhakaran, Vinodkumar, Reif, Emily, Du, Nan, Hutchinson, Ben, Pope, Reiner, Bradbury, James, Austin, Jacob, Isard, Michael, Gur-Ari, Guy, Yin, Pengcheng, Duke, Toju, Levskaya, Anselm, Ghemawat, Sanjay, Dev, Sunipa, Michalewski, Henryk, Garcia, Xavier, Misra, Vedant, Robinson, Kevin, Fedus, Liam, Zhou, Denny, Ippolito, Daphne, Luan, David, Lim, Hyeontaek, Zoph, Barret, Spiridonov, Alexander, Sepassi, Ryan, Dohan, David, Agrawal, Shivani, Omernick, Mark, Dai, Andrew M., Pillai, Thanumalayan Sankaranarayana, Pellat, Marie, Lewkowycz, Aitor, Moreira, Erica, Child, Rewon, Polozov, Oleksandr, Lee, Katherine, Zhou, Zongwei, Wang, Xuezhi, Saeta, Brennan, Diaz, Mark, Firat, Orhan, Catasta, Michele, Wei, Jason, Meier-Hellstern, Kathy, Eck, Douglas, Dean, Jeff, Petrov, Slav, Fiedel, Noah

arXiv.org Artificial Intelligence

Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model PaLM. We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.
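The abstract's central idea, few-shot learning, means the model adapts to a task from a handful of in-context demonstrations rather than gradient updates. A minimal sketch of how such a prompt is typically assembled is below; the template and example sentences are illustrative assumptions, not taken from the PaLM paper.

```python
def build_few_shot_prompt(examples, query):
    """Concatenate labeled demonstrations, then append the unanswered query.

    `examples` is a list of (input, label) pairs; the model is expected to
    continue the text after the final "A:" with the label for `query`.
    """
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

# Two hypothetical sentiment demonstrations followed by a new query.
demos = [
    ("The movie was wonderful.", "positive"),
    ("I hated every minute.", "negative"),
]
prompt = build_few_shot_prompt(demos, "A delightful surprise from start to finish.")
print(prompt)
```

The prompt ends with a bare "A:" so that the language model's continuation serves as the prediction; no task-specific fine-tuning is involved, which is why the abstract emphasizes that few-shot learning "drastically reduces the number of task-specific training examples."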


Deci's NLP Model Achieves Breakthrough Performance at MLPerf

#artificialintelligence

TEL AVIV, Israel, Sept. 8, 2022 -- Deci, the deep learning company harnessing Artificial Intelligence (AI) to build better AI, announced results for its Natural Language Processing (NLP) inference model submitted to the MLPerf Inference v2.1 benchmark suite under the open submission track. Generated by Deci's Automated Neural Architecture Construction (AutoNAC) technology, the NLP model, dubbed DeciBERT-Large, ran on Dell PowerEdge R7525 hardware using the AMD EPYC 7773X processor. The resulting model outperformed the throughput of the BERT-Large model by 6.46x while achieving a 1% boost in accuracy. The model was submitted under the offline scenario in MLPerf's open division in the BERT 99.9 category. The goal was to maximize throughput while keeping the accuracy within a 0.1% margin of error from the baseline, which is 90.874.


Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance

#artificialintelligence

Posted by Sharan Narang and Aakanksha Chowdhery, Software Engineers, Google Research In recent years, large neural networks trained for l...


How Supercomputers Help To Create The Next Generation of Fully Integrated Data Centres

#artificialintelligence

"Data centre is an asset that needs to be protected" - Michael Kagan, CTO of NVIDIA. On the first day of the NVIDIA GPU Technology Conference, Jensen Huang, founder of NVIDIA, revealed the company's three-year DPU roadmap, which featured the new NVIDIA BlueField-2 family of DPUs and the NVIDIA DOCA software development kit for building applications on DPU-accelerated data centre infrastructure services. Michael Kagan, CTO of NVIDIA, recently explained in a talk the next generation of fully integrated data centres and how supercomputers and edge AI help to augment such initiatives. Kagan stated that the state-of-the-art technologies from both NVIDIA and Mellanox created a great opportunity to build a new class of computers, i.e. fully integrated cloud data centres designed to handle the workloads of the 21st century. Historically, servers were the unit of computing. But eventually Moore's law slowed down, as the performance of CPUs could not keep up with workload demands. According to Kagan, with the revolution of cloud AI and edge computing, instead of a single server, the entire data centre has become the new unit of computing, designed to handle parallel workloads.


Intel Unveils Strategy for State-of-the-Art Artificial Intelligence

#artificialintelligence

Intel announces an AI strategy to drive breakthrough performance, democratize access, and maximize societal benefits. Intel introduces the industry's most comprehensive data center compute portfolio for AI: the new Intel Nervana platform. Intel aims to deliver up to a 100x reduction in the time to train a deep learning model over the next three years compared to GPU solutions. Intel reinforces its commitment to an open AI ecosystem through an array of developer tools built for ease of use and cross-compatibility, laying the foundation for greater innovation.